ABSTRACT: Human coronavirus NL63 (HCoV-NL63) has recently been identified as a causative agent of
acute respiratory tract illnesses in infants and young children. The HCoV-NL63 spike (S) protein mediates 
virion attachment to cells and subsequent fusion of the viral and cellular membranes. This viral entry 
process is a primary target for vaccine and drug development. HCoV-NL63 S is expressed as a single- 
chain glycoprotein and consists of an N-terminal receptor-binding domain (S1) and a C-terminal 
transmembrane fusion domain (S2). The latter contains two highly conserved heptad-repeat (HR) sequences 
that are each extended by 14 amino acids relative to those of the SARS coronavirus or the prototypic 
murine coronavirus, mouse hepatitis virus. Limited proteolysis studies of the HCoV-NL63 S2 fusion core 
identify an α-helical domain composed of a trimer of the HR segments N57 and C42. The crystal structure
of this complex reveals three C42 helices entwined in an oblique and antiparallel manner around a central 
triple-stranded coiled coil formed by three N57 helices. The overall geometry comprises distinctive high- 
affinity conformations of interacting cross-sectional layers of the six helices. As a result, this structure is 
unusually stable, with an apparent melting temperature of 78 °C in the presence of the denaturant guanidine 
hydrochloride at 5 M concentration. The extended HR regions may therefore be required to prime the 
group 1 S glycoproteins for their fusion-activating conformational changes during viral entry. Our results
provide an initial basis for understanding an intriguing interplay between the presence or absence of 
proteolytic maturation among the coronavirus groups and the membrane fusion activity of their S 
glycoproteins. This study also suggests a potential strategy for the development of improved HCoV-NL63
fusion inhibitors.


The coronavirus family of enveloped positive-stranded
RNA viruses includes significant human and animal pathogens.
The newly identified SARS coronavirus (SARS-CoV)
was found to be the etiological agent of the 2002–2003
epidemic of severe acute respiratory syndrome (SARS) that
killed nearly 10% of infected individuals (1, 2). Human
coronavirus 229E (HCoV-229E) and human coronavirus
OC43 (HCoV-OC43) are endemic and responsible for up to
30% of mild upper respiratory tract infections (3, 4). The
human coronavirus NL63 (HCoV-NL63) has recently been
associated with conjunctivitis, croup, and acute respiratory
infections in children and immunocompromised patients (5–8).
On the basis of genetic and serological relationships (3,
9), HCoV-NL63 is closely related to the group 1 coronaviruses
that include HCoV-229E and animal coronaviruses
like porcine epidemic diarrhea virus, transmissible gastroenteritis
virus, and feline infectious peritonitis virus, whereas
HCoV-OC43 is a group 2 virus, related to animal coronaviruses
like mouse hepatitis virus (MHV), porcine hemagglutinating
encephalomyelitis virus, and bovine coronavirus.
SARS-CoV and SARS-CoV-like viruses found in animals
are outliers of group 2 and have been assigned to group 4
(10–12). The factors that influence the evolution and
pathogenicity of coronaviruses remain largely unknown.
Recent work suggests that the species tropism and virulence
of a specific coronavirus are largely determined by its spike
(S) glycoprotein (13–19). The S proteins mediate virion
attachment to cells and subsequent fusion of the virus and
cell membranes and are the major antigenic determinants of
coronaviruses (20, 21).

Coronavirus S proteins are synthesized as single polypeptide
chains that oligomerize in the endoplasmic reticulum
and are processed through the Golgi, eventually forming
long, petal-shaped spikes that protrude from the virion
surface (20). By analogy with a large group of so-called class
I fusion proteins, coronavirus S proteins consist of an
N-terminal receptor-binding domain (S1) and a C-terminal
transmembrane fusion domain (S2) (21–23). In some cases
such as MHV and HCoV-OC43, the glycoprotein precursor
requires proteolytic cleavage to generate the mature S1 and
S2 chains by a furin-like protease in the producer cell (22,
24–27). By contrast, the S proteins of SARS-CoV, HCoV-
FIGURE 1: Interactions of two heptad-repeat regions in HCoV-NL63 S2. (a) Schematic representation of S2. The positions of the fusion
peptide (FP), the two hydrophobic heptad repeats (HRN and HRC), the transmembrane region (TM), and the cytoplasmic region (CP) are
shown. (Note that the exact location of the fusion peptide is not known.) Residues are numbered according to their position in the HCoV-NL63
S sequence. (b) Sequence alignment of the HRN (top) and HRC (bottom) regions derived from coronavirus S2 proteins. The sequences
are shown for HCoV-NL63 (AAS58177), HCoV-229E (AAK32191), CCoV (canine coronavirus; AAQ17220), FIPV (feline infectious
peritonitis virus; BAA06805), PEDV (porcine epidemic diarrhea virus; AAT75298), PRCoV (porcine respiratory coronavirus; CAA42686),
HCoV-OC43 (S44241), SARS-CoV (AAP13441), IBV (avian infectious bronchitis virus; AAO34396), and MHV (mouse hepatitis virus;
P11225). The locations of the N57 and C42 peptides from HCoV-NL63 S2 are indicated. Buried core residue positions of the N57/C42
complex are shown (see text and Figures 4 and 5). Chemically similar residues among coronaviruses are colored red.


229E, and HCoV-NL63 lack furin recognition sites and are
found uncleaved on the virion surface (28–30). Host-cell
receptors for several human coronaviruses have been identified.
For example, SARS-CoV and HCoV-NL63 employ
angiotensin-converting enzyme 2 (ACE2) as a receptor (31,
32), and HCoV-229E engages aminopeptidase N (CD13) (33,
34). Following receptor binding, coronaviruses are endocytosed
and enter cells by pH-dependent fusion in the
endosome (31, 35–37). In addition, coronavirus S proteins
can mediate cell–cell fusion at neutral pH (24, 25, 32, 35,
38, 39). Although the activation triggers appear to differ in
virus–cell and cell–cell fusion, in both cases the S2 domain
undergoes large-scale conformational changes in order to
bring two membranes into close apposition and induce fusion,
a process that results in the release of the virion core into
the cell cytoplasm (40–44).

Coronavirus S2 proteins share several characteristic features
with other known class I fusion proteins, including
influenza virus HA2 and HIV-1 gp41. A hydrophobic fusion
peptide, which is exposed and inserted into the target-cell
membrane during the fusion process (41), is presumably
located at the N-terminal region of S2 (Figure 1a). S2 is a
type I integral membrane protein with a single transmembrane
domain. Flanking both the fusion peptide and transmembrane
domains are two highly conserved regions
consisting of heptad repeats (HR) of hydrophobic residues
characteristic of coiled coils. These regions are denoted HRN
and HRC, respectively, and are separated by ~250 intervening
amino acids. Structural studies of soluble S2 cores from
SARS-CoV and MHV showed that the HRN and HRC
segments associate to form a highly stable six-helix bundle
(22, 45–52). Three HRC helices pack in an antiparallel
manner into three hydrophobic grooves on the surface of an
interior trimeric coiled coil formed by three HRN helices.


This six-helix bundle is a well-known structural motif of class
I fusion proteins and may in fact be the final fusogenic form
of S2 (40–43). Current thinking postulates that formation
of the six-helix bundle is mechanistically and thermodynamically
linked to merging of the viral and cellular membranes
(52–54).

Peptides corresponding to the HRC region of S2 can inhibit
SARS-CoV and HCoV-NL63 infection at micromolar concentrations
(50, 55, 56). Analogous to the HIV-1 gp41
peptide fusion inhibitors (40), these S2 HRC peptides have
been proposed to act in a dominant-negative manner to
interfere with six-helix bundle formation, thereby inhibiting
coronaviral entry. Interestingly, there are 14-residue in-phase
insertions in both HR sequences of all known group 1
coronaviruses including HCoV-229E and HCoV-NL63 (Figure 1b).
It remains unclear how these sequence elements
affect the folding to, or stability of, the fusogenic conformation
of S2. A detailed mechanistic understanding of interactions
between the HR regions in a group 1 virus should help
clarify the mechanism of membrane fusion mediated by the
S protein and could assist antiviral drug and vaccine
development. Here we have used a protein-dissection approach
to identify and determine the X-ray crystal structure
of a proteolytically resistant S2 core from HCoV-NL63, and
we discuss the implications of this structure for coronaviral
membrane fusion and its inhibition.


MATERIALS AND METHODS 


Protein Expression, Purification, and Proteolysis. The
HCoV-NL63 S2 HRN/HRC segments (representing residues
981–1046 and 1237–1286 of full-length S; Figure 1a) were
cloned into the pET24a vector (Novagen) to generate pN66/C50
using standard molecular biology techniques. Plasmid
pN66(L6)C50 was derived from pN66/C50 by the insertion
of a Ser-Gly-Gly-Arg-Gly-Gly sequence between the C
terminus of the N66 segment and the N terminus of the C50
segment. The N57(L10)C42 protein consists of residues
981–1037 and 1242–1283 connected by the linker residues
Ser-Gly-Gly-Arg-Gly-Ser-Gly-Arg-Gly-Gly. All constructs
were appended to the TrpLE’ leader sequence (57). The
N66(L6)C50 and N57(L10)C42 proteins were expressed in
Escherichia coli BL21(DE3)/pLysS, purified from inclusion
bodies, and cleaved from the TrpLE’ leader sequence with
cyanogen bromide as described (58). All proteins
were purified to homogeneity by reverse-phase HPLC
(Waters, Inc.) on a Vydac C18 preparative column (Hesperia,
CA) using a water–acetonitrile gradient in the presence of
0.1% trifluoroacetic acid and lyophilized. Protein identities
were confirmed by electrospray mass spectrometry (PerSeptive
Biosystems Voyager Elite, Cambridge, MA). Proteinase
K digestion was performed at protease/protein ratios of 1:200
(wt/wt) and room temperature in 50 mM Tris-HCl (pH 8.0).
Proteolytic fragments were analyzed by reverse-phase HPLC
and assigned by N-terminal sequencing and mass spectrometry.
Protein concentrations were determined by using the
method of Edelhoch (59).
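
As an illustration of the concentration measurement cited above, the following minimal sketch converts an absorbance at 280 nm into a molar concentration. It assumes the commonly quoted Edelhoch per-residue coefficients (5690 M^-1 cm^-1 for Trp, 1280 for Tyr, 120 for cystine); the residue counts and the absorbance reading are hypothetical and are not taken from this study.

```python
# Edelhoch-style concentration estimate from absorbance at 280 nm.
# Coefficients are the commonly quoted per-residue values (M^-1 cm^-1);
# check them against reference 59 before relying on them.
EPS_TRP = 5690.0
EPS_TYR = 1280.0
EPS_CYSTINE = 120.0


def extinction_280(n_trp, n_tyr, n_cystine=0):
    """Molar extinction coefficient of the unfolded chain at 280 nm."""
    return EPS_TRP * n_trp + EPS_TYR * n_tyr + EPS_CYSTINE * n_cystine


def molar_concentration(a280, n_trp, n_tyr, n_cystine=0, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * path length)."""
    eps = extinction_280(n_trp, n_tyr, n_cystine)
    if eps <= 0:
        raise ValueError("no Trp/Tyr/cystine: A280 cannot give a concentration")
    return a280 / (eps * path_cm)


# Hypothetical chain with 1 Trp and 2 Tyr and a hypothetical reading of 0.4125.
conc = molar_concentration(0.4125, n_trp=1, n_tyr=2)
print(f"{conc * 1e6:.1f} uM")
```

The reading would normally be taken on protein denatured in GuHCl so that the per-residue coefficients apply.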

Circular Dichroism Spectroscopy. CD experiments were
performed in TBS (50 mM Tris-HCl, pH 8.0, 150 mM
NaCl) at 50 μM protein on an Aviv 62A/DS (Aviv Associates,
Lakewood, NJ) spectropolarimeter equipped with a thermoelectric
temperature control. CD spectra were collected from
260 to 200 nm at 4 °C, using an averaging time of 5 s, a cell
path length of 0.1 cm, and a bandwidth of 1 nm. A [θ]222
value of −35 000 deg cm² dmol⁻¹ was taken to correspond
to 100% helix (60). Thermal stability was determined by
monitoring [θ]222 as a function of temperature in TBS (pH
8.0) and with the addition of 5 M guanidine hydrochloride
(GuHCl) to facilitate unfolding. Thermal melts were performed
in 2 °C intervals with a 2 min equilibration at the
desired temperature and an integration time of 30 s. Reversibility
was verified by repeated scans. Superimposable
folding and unfolding curves were observed, and >95% of
the signal was regained upon cooling. Values of midpoint
unfolding transitions (Tm) were estimated by evaluating the
maximum of the first derivative of [θ]222 versus temperature
data (61).
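
The first-derivative estimate of Tm described above can be sketched as follows. This is a minimal illustration, not the analysis code used for this study: the derivative is taken by central differences, no smoothing is applied, and the melt curve is synthetic (a two-state sigmoid centered at a hypothetical 78 °C and sampled every 2 °C).

```python
import math


def first_derivative(temps, signal):
    """Central-difference d(signal)/dT at each interior temperature point."""
    return [
        (temps[i], (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1]))
        for i in range(1, len(temps) - 1)
    ]


def melting_midpoint(temps, signal):
    """Temperature at which the first derivative of the signal is largest.

    [theta]222 becomes less negative on unfolding, so the derivative is
    positive through the transition and peaks at the midpoint.
    """
    return max(first_derivative(temps, signal), key=lambda pair: pair[1])[0]


# Synthetic melt (illustrative values only), sampled at 2-degree intervals.
temps = list(range(50, 100, 2))
signal = [-30000.0 + 28000.0 / (1.0 + math.exp(-(t - 78.0) / 3.0)) for t in temps]
print(melting_midpoint(temps, signal))
```

With 2 °C sampling the estimate is only as fine as the temperature grid; real data would usually be smoothed before differentiation.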

Sedimentation Equilibrium Analysis. Analytical ultracentrifugation
measurements were carried out on a Beckman
XL-A (Beckman Coulter) analytical ultracentrifuge equipped
with an An-60 Ti rotor (Beckman Coulter) at 20 °C. Protein
samples were dialyzed overnight against TBS (pH 8.0),
loaded at initial concentrations of 25, 100, and 400 μM, and
analyzed at rotor speeds of 10, 16, and 19 krpm. Data were
acquired at two wavelengths per rotor speed setting and
processed simultaneously with a nonlinear least-squares
fitting routine (62). Solvent density and protein partial
specific volume were calculated according to solvent and
protein composition, respectively (63). Nonrandom residuals,
indicative of aggregation or deviation from ideality, were
observed for N66(L6)C50. The apparent molecular mass of
N57(L10)C42 was within 10% of that calculated for an ideal
trimer, with no systematic deviation of the residuals.
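
For a single ideal species, the logarithm of concentration is linear in the square of the radius, with slope M(1 − v̄ρ)ω²/(2RT). The sketch below recovers a molecular mass from that slope. It is a simplified stand-in for the global nonlinear fit cited above; the partial specific volume (0.73 mL/g), solvent density (1.005 g/mL), and radial range are assumed values, and the data are synthetic.

```python
import math

R_GAS = 8.314e7  # erg mol^-1 K^-1 (CGS units, so radii are in cm)


def fit_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_per_mass(rpm, temp_k, vbar, rho):
    """(1 - vbar*rho) * omega^2 / (2RT): slope of ln c vs r^2 per g/mol."""
    omega = 2.0 * math.pi * rpm / 60.0
    return (1.0 - vbar * rho) * omega ** 2 / (2.0 * R_GAS * temp_k)


def molar_mass(radii_cm, conc, rpm, temp_k, vbar, rho):
    """Apparent molar mass (g/mol) from an ideal single-species gradient."""
    slope = fit_slope([r * r for r in radii_cm], [math.log(c) for c in conc])
    return slope / slope_per_mass(rpm, temp_k, vbar, rho)


# Synthetic gradient for a 35.1 kDa species at 16 krpm and 20 C (assumed vbar, rho).
radii = [6.80 + 0.01 * i for i in range(31)]
k = 35100.0 * slope_per_mass(16000, 293.15, 0.73, 1.005)
conc = [0.2 * math.exp(k * (r * r - radii[0] ** 2)) for r in radii]
mass = molar_mass(radii, conc, 16000, 293.15, 0.73, 1.005)
print(round(mass))
```

Systematic curvature in the residuals of such a linear fit is the kind of nonrandom deviation that signals aggregation or nonideality.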

Crystallization and Data Collection. HPLC-purified N57(L10)C42
was solubilized in 6 M GuHCl and 50 mM Tris-HCl
(pH 8.0) and refolded by dilution into TBS (pH 8.0).
Table 1: Summary of Crystallographic Analysis

Data Collection
  resolution (Å)             51.6–1.75
  unique reflections         25668
  redundancy                 3.2 (3.1)ᵃ
  completeness (%)           95.6 (98.1)
  Rmerge (%)ᵇ                6.8 (36.3)
  I/σ(I)                     10.7 (3.7)
  space group                P2₁
  unit cell parameters       a = 50.0 Å, b = 51.7 Å, c = 54.1 Å, β = 107.6°
  molecules in AU            3
  solvent content (%)        32.2

Refinement
  resolution (Å)             51.6–1.75
  reflections                24376
  Rcryst (%)ᶜ                19.6
  Rfree (%)ᶜ                 24.0
  protein atoms              2009
  water molecules            228
  sodium ions                3
  acetate ion                1

rms Deviations from Ideal Geometry
  bond lengths (Å)           0.02
  bond angles (deg)          1.5
  torsion angles (deg)       4.5
  B-values (Å²)              2.6

ᵃ Values in parentheses refer to the highest-resolution shell 1.75–1.81 Å.
ᵇ Rmerge = Σ|I − ⟨I⟩|/ΣI, where I is the integrated intensity of a given reflection.
ᶜ Rcryst = Σ|Fo − Fc|/ΣFo. Rfree = Rcryst calculated using 5% of the reflection data chosen
randomly and omitted from the start of refinement.


The protein was repurified by size exclusion on a Superdex
200 column equilibrated with TBS (pH 8.0), exchanged into
10 mM Tris-HCl (pH 8.0), and concentrated to 20 mg/mL
by ultrafiltration. N57(L10)C42 was crystallized using the
hanging drop vapor diffusion method by equilibrating a
solution containing 1 μL of 6 mg/mL protein and 1 μL of
reservoir buffer against reservoir buffer (0.7 M sodium
acetate, 0.2 M imidazole buffer, pH 5.15). Crystals belong to space
group P2₁ (a = 50.0 Å, b = 51.7 Å, c = 54.1 Å, β = 107.6°)
and contain three monomers in the asymmetric unit, with a
solvent content of 32.2% (Table 1). The crystals were
harvested in 0.6 M sodium acetate, 0.2 M imidazole buffer,
pH 5.15, 25% PEG 400 and frozen in liquid nitrogen.
Diffraction data were recorded at 100 K on a MAR345 image
plate at the beamline X4C of the National Synchrotron Light
Source at Brookhaven National Laboratory. The images were
indexed and integrated using a monoclinic unit cell with the
program DENZO (64). The intensities were scaled in P2
symmetry with the program SCALEPACK (64); the systematic
absence of intensities indicates a 2-fold screw axis.

Structure Determination. Initial phases were determined
by molecular replacement with the program Phaser (65) using
the structure of the SARS-CoV N50/C36 trimer (1ZV8) as
a search model. Three N50/C36 molecules were oriented and
placed in the asymmetric unit with a Z score of 7.6 and a
final refined LLG of 227. In order to remove model bias,
this model and the dataset for N57(L10)C42 were directly
fed to the program Arp/Warp (66), which allowed ~71% of
the final model to be automatically traced. The resulting
experimental electron density map was of excellent quality
and showed the location of most of the side chains. Although
some electron density was evident for the linker loop region,
no model could be built into it. Density interpretation and
manual model building were done with the program O (67).
Crystallographic refinement of the model was carried out
by using Refmac (68), resulting in an Rfree of 29.0% and
an Rwork of 24.5% between 51.6 and 1.75 Å resolution. At
this stage, three sodium ions (near the N termini of the C42
helices), one acetate ion, and water molecules were modeled
in the electron density. The three tentatively assigned metal
atoms are 8.6, 6.5, and 4.4 σ electron density peaks (in
difference maps calculated with phases from the refined
model) and refined to B-factors of 22.8, 22.4, and 28.8 Å²,
respectively. These peaks are hexahedrally coordinated in
the crystal by five oxygen or nitrogen atoms of the protein
and one water molecule, characteristic of metal ions. Refinement
was concluded using Refmac (68) with TLS groups
assigned for each N57 or C42 monomer (69). The final
model (Rcryst = 19.6% and Rfree = 24.0% for the resolution
range 51.6–1.75 Å) consists of residues 982–1031 and
1243–1280 (monomer A), 981–1032 and 1243–1279
(monomer B), and 981–1032 and 1244–1279 (monomer
C) in the asymmetric unit, three sodium ions, one acetate
ion, and 228 water molecules. Bond lengths and bond angles
of the model have root-mean-square (rms) deviations from
ideality of 0.02 Å and 1.5°, respectively. All protein residues
are in the most favored regions of the Ramachandran plot.
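
The agreement statistics quoted above follow the definitions given in the footnotes to Table 1. A minimal sketch of those two formulas is shown here; the reflection indices, intensities, and structure-factor amplitudes are invented for illustration, and no resolution shells, weights, or scale factors are modeled.

```python
def r_merge(observations):
    """Rmerge = sum |I - <I>| / sum I over repeated measurements.

    `observations` maps a reflection index (h, k, l) to the list of
    integrated intensities measured for it.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """Rcryst = sum |Fo - Fc| / sum Fo over paired amplitudes.

    Rfree is the same quantity evaluated only on the reflections that
    were set aside before refinement (5% in this study).
    """
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


# Invented numbers, for illustration only.
obs = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(round(r_merge(obs), 4))
print(round(r_factor([10.0, 20.0], [9.0, 22.0]), 4))
```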


Structure Analysis. Coiled-coil parameters were calculated
by fitting the Cα backbones to a supercoil parametrization
suggested by Crick (70). The local pitch values as a function
of residue number were obtained with TWISTER (71). The
rms deviations were calculated with LSQKAB in the CCP4i
program suite (72). Buried surface areas were calculated from
the difference of the accessible side-chain surface areas of
the six-helix bundle structure and of the individual helical
monomers by using CNS 1.0 (73). Figures were generated
with SETOR (74), Insight II (Accelrys), and GRASP (75).
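
To make the notion of superhelical pitch concrete, the sketch below generates the path of one helix axis winding around a bundle axis and recovers the pitch from the rate at which its azimuthal angle changes with height. This is not the TWISTER algorithm or a full Crick fit; it assumes the bundle axis lies along z, and the 6.5 Å supercoil radius is a hypothetical value.

```python
import math


def supercoil_axis(radius, pitch, z_values, left_handed=True):
    """Points on a helix-axis path that winds once per `pitch` along z."""
    sign = -1.0 if left_handed else 1.0
    return [
        (radius * math.cos(sign * 2.0 * math.pi * z / pitch),
         radius * math.sin(sign * 2.0 * math.pi * z / pitch),
         z)
        for z in z_values
    ]


def unwrap(angles):
    """Remove 2*pi jumps so the angle varies continuously."""
    out = [angles[0]]
    for a in angles[1:]:
        step = a - out[-1]
        step -= 2.0 * math.pi * round(step / (2.0 * math.pi))
        out.append(out[-1] + step)
    return out


def estimate_pitch(points):
    """Pitch = 2*pi / |d(angle)/dz| from a least-squares line through the data."""
    zs = [p[2] for p in points]
    angles = unwrap([math.atan2(p[1], p[0]) for p in points])
    n = len(zs)
    mean_z, mean_a = sum(zs) / n, sum(angles) / n
    slope = (sum((z - mean_z) * (a - mean_a) for z, a in zip(zs, angles))
             / sum((z - mean_z) ** 2 for z in zs))
    return 2.0 * math.pi / abs(slope)


# Ideal left-handed path sampled every 1.5 A over ~70 A (hypothetical radius).
path = supercoil_axis(6.5, 138.0, [1.5 * i for i in range(48)])
print(round(estimate_pitch(path), 1))
```

Fitting the slope over a sliding window of residues, rather than over the whole path, would give a local pitch profile.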


RESULTS AND DISCUSSION 


Identification of the N57/C42 Complex. To investigate the
interaction between the two predicted HR regions in HCoV-NL63
S2 (Figure 1a), we constructed a recombinant protein,
denoted N66(L6)C50, in which residues 981–1046 and
1237–1286 are covalently tethered by a short flexible linker
sequence. N66(L6)C50 was produced by bacterial expression,
purified by reverse-phase HPLC, and refolded by renaturation
from GuHCl (see Materials and Methods). Circular dichroism
(CD) spectroscopy indicates that N66(L6)C50 forms an
extremely stable helical structure that does not unfold on
heating to 98 °C in TBS (pH 8.0) (data not shown). On the
basis of the mean residue ellipticity at 222 nm at 4 °C and
50 μM protein concentration, we estimate that 90 residues
(~75% helix content) are in α-helical conformation. However,
sedimentation equilibrium measurements indicate that
N66(L6)C50 does not form a monodisperse species and
exhibits an apparent molecular mass ranging from ~45 to 180 kDa
as the total protein concentration increases from 25 to 400 μM.
Hence, the recombinant protein construct associates to form
higher-order complexes.
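
The helix-content estimate above follows from the calibration given in Materials and Methods, where a [θ]222 of −35 000 deg cm² dmol⁻¹ is taken as 100% helix. A minimal sketch of the arithmetic is given below; the chain length of 122 residues is the sum of the N66, L6, and C50 segments, while the ellipticity of −25 800 is a hypothetical reading chosen only to illustrate the calculation.

```python
FULL_HELIX_MRE = -35000.0  # deg cm^2 dmol^-1, taken as 100% helix


def helix_fraction(mre_222):
    """Fractional helicity from the mean residue ellipticity at 222 nm."""
    return max(0.0, min(1.0, mre_222 / FULL_HELIX_MRE))


def helical_residues(mre_222, n_residues):
    """Approximate number of residues in helical conformation."""
    return round(helix_fraction(mre_222) * n_residues)


n_residues = 66 + 6 + 50        # N66 + six-residue linker + C50
example_mre = -25800.0          # hypothetical reading, not a reported value
print(round(helix_fraction(example_mre), 2), helical_residues(example_mre, n_residues))
```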


To trim unfolded regions that potentially contribute to the
aggregation, N66(L6)C50 was subjected to limited proteolysis
by proteinase K. This digestion generates two proteolytic
fragments. An N-terminal fragment corresponds to residues
981–1037 (N57), and a C-terminal fragment corresponds
FIGURE 2: Solution properties of N57(L10)C42. (a) Circular
dichroism spectrum at 4 °C in TBS (pH 8.0) and 50 μM protein
concentration. (b) Thermal melt monitored by CD at 222 nm. The
filled circles show data collected in the presence of 5 M GuHCl, a
chemical denaturant. (c) Sedimentation equilibrium data for a 100
μM sample at 20 °C and 16 krpm in TBS (pH 8.0). The deviation
in the data from the linear fit for a trimeric model is plotted (upper).


to residues 1242–1283 (C42) (Figure 1b). Because proteinase
K is not sequence specific, the proteolytic fragments N57
and C42 may more accurately define the α-helical domain
structure. Accordingly, we adopted the peptides N57 and C42
for further study. We produced a bacterially expressed single-chain
model, designated N57(L10)C42, for the N57/C42
complex. In this construct, the two helical segments are
connected via the linker Ser-Gly-Gly-Arg-Gly-Ser-Gly-Arg-Gly-Gly.

Solution Properties of the N57/C42 Complex. The
N57(L10)C42 protein contains >90% helical structure at 4 °C
and 50 μM protein concentration in TBS (pH 8.0), as judged
by CD studies (Figure 2a). Under these conditions,
N57(L10)C42 has a thermal stability that exceeds 100 °C and
unfolds cooperatively and reversibly with a midpoint of thermal
denaturation (Tm) of 78 °C in the presence of 5 M GuHCl
(Figure 2b). Over a 16-fold range of protein concentrations,
the observed molecular mass of N57(L10)C42 is 36.9 kDa
in TBS (pH 8.0), as determined by sedimentation equilibrium
FIGURE 3: Crystal structure of the HCoV-NL63 N57(L10)C42
complex. (a) Lateral view of the N57(L10)C42 trimer. The N57 
helices are shown in red and the C42 helices in green. The N termini 
of the N57 helices point toward the top of the page, and those of 
the C42 helices point toward the bottom. Residues are numbered 
according to their position in the HCoV-NL63 S sequence. (b) Axial 
view of the N57(L10)C42 trimer. The view is from the N termini 
of the N57 helices looking down the 3-fold axis of the trimer. 


experiments (Figure 2c). This value, compared with the
expected molecular mass of 35.1 kDa for a trimer, indicates
that N57(L10)C42 exists in a discretely trimeric state in
solution. Thus, the extra terminal residues in N66 and C50
trimmed by proteinase K arguably contribute to aggregation
of N66(L6)C50. In summary, the slightly smaller N57(L10)C42
domain forms an exceedingly stable, soluble trimeric
helical structure.
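
The assignment of a trimeric state rests on comparing the observed mass with integer multiples of the chain mass. The short sketch below makes that comparison explicit, using the 36.9 kDa observed mass and the 35.1 kDa expected trimer mass quoted above; the helper name is hypothetical.

```python
def oligomeric_state(observed_kda, monomer_kda):
    """Nearest integer number of chains and the relative mass deviation."""
    ratio = observed_kda / monomer_kda
    n = max(1, round(ratio))
    deviation = abs(observed_kda - n * monomer_kda) / (n * monomer_kda)
    return n, deviation


monomer_kda = 35.1 / 3.0  # expected trimer mass divided by three chains
n_chains, deviation = oligomeric_state(36.9, monomer_kda)
print(n_chains, f"{deviation:.1%}")
```

The deviation of about 5% is inside the 10% tolerance stated in Materials and Methods for an ideal trimer.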

Crystal Structure of N57(L10)C42. The X-ray crystal
structure of N57(L10)C42 was determined at 1.75 Å resolution
by molecular replacement using the six-helix structure
of the SARS-CoV N50/C36 complex as a search model (see
Materials and Methods). The final experimental electron
density map is of good quality and reveals the positions of
all of the amino acid residues except for a few disordered
ones at the chain termini and in the interhelical linker. The
refined model has an R-factor of 19.6% and a free R-value
of 24.0%. Data collection and refinement statistics are
summarized in Table 1. Despite asymmetric crystal contacts,
the three individual chains in the N57(L10)C42 structure
have essentially the same conformation and degree of order
with the average B-factors varying from 21 to 23 Å². (Note
that the connection of an N57 helix to a C42 helix is
disordered and not visible in the current structure.) The three
N57 and three C42 helices in the noncrystallographic trimer
can be superimposed on each other with an rms deviation for
the Cα atoms of 0.2–0.3 Å and with the largest deviations
occurring at their ends.

As anticipated, N57(L10)C42 is a six-stranded helical
bundle (Figure 3). An approximate 3-fold axis of symmetry
coincides with the superhelical axis. Each polypeptide chain
has an α-helical hairpin conformation (note that the 10-residue
peptide linker region is not visible in the electron
density maps and therefore must be disordered). The N57
helices form an interior, parallel trimeric coiled coil. Three
C42 helices pack in a left-handed and antiparallel direction
into hydrophobic grooves on the surface of this coiled-coil
trimer. The six-helix bundle forms an overall rod-shaped
structure approximately 78 Å in length with a maximum
diameter of 28 Å. The N terminus of N57 and the C terminus
of C42 are oriented at the same end of the rod-shaped
structure (Figure 3a). The N-terminal end of C42 is ~14 Å
from the C-terminal end of the N57 trimer. This packing
arrangement would serve to bring the fusion peptides and
transmembrane anchors, and therefore the attached target
and viral membranes, into close proximity.

Core Packing in the N57 Parallel Three-Stranded Coiled
Coil. The N57 coiled-coil core includes approximately 48
residues (984–1031) from each chain (the most N-terminal
residue and the five most C-terminal residues cannot be seen
in the electron density maps). Fourteen hydrophobic and three
polar residues from each N57 peptide are packed in layers
at the coiled-coil interface (Figure 4a). These core amino
acids can be grouped into seven heptad repeats (Figure 4f).
The last two of these repeats exhibit canonical knobs-into-holes
packing, in which the side chains of the a and d
residues in one α-helix point directly into the hole formed
between the side chains of four residues in an adjacent helix
(Figure 4b,c). The angles between the Cα–Cα and Cα–Cβ
vectors at the a and d layers are 120° and 150°, respectively.
This acute packing geometry is characteristic of parallel
trimeric coiled coils (70, 76–78). In contrast, the first five
heptad repeats of N57 lack any regular 3–4 hydrophobic
periodicity. Instead, cross-sectional layers containing an “x-like”
symmetric pattern (Figure 4d) (79) alternate with layers
containing a da-like two-residue pattern (we refer to this as
“y-like” packing by analogy; see Figure 4e) (52). Side chains
at the x position project simultaneously toward the center of
the hydrophobic core (Figure 4d), and similarly alternating
small and large side chains at the y positions pack in a
hexagonal arrangement (Figure 4e). All these core side chains
(excluding alanine) adopt their well-populated rotamer
conformations in α-helices (80).

Because a knobs-into-knobs packing core at the x position
can cause underwinding and bending of the helix, y layers
often need to compensate for the existence of x layers in the
coiled-coil structure (79). This type of packing geometry also
tends to have small side chains at the first of the two y
positions in order to preclude clashes (81). The structural
result of this x–y–y sequence motif is exactly what we
observed in the N57 trimeric coiled coil (Figure 4f):
Ile986-Ser989-Phe990, Val993-Ala996-Ile997, Thr1000-Ala1003-Ile1004,
and Val1007-Ala1010-Leu1011. In contrast to the
C-terminal part of the N57 coiled coil that has a superhelical
pitch value of 138 Å, the pitch of the N-terminal x–y–y
segment is 214 Å. Increasing the pitch of the supercoil (i.e.,
underwinding) gives rise to specific non-close-packed layers
at the sites of the heptad phase shifts (Figure 4d,e), as
compared with close knobs-into-holes side-chain packing
seen in the a and d layers (Figure 4b,c). Although the
presence of one or two x–y–y breaks in coiled-coil
sequences has been observed in other viral fusion proteins
(see Figure 4f) (52, 82, 83), the combined x- and y-like
FIGURE 4: Core packing in the N57 parallel coiled-coil trimer. (a) Lateral view of the trimer. Red van der Waals surfaces identify residues
at the a positions, green surfaces identify residues at the d positions, blue surfaces identify residues at the x positions, and yellow surfaces
identify residues at the y positions. (b) Cross-section of the trimer in the Val1018 (a) layer. The 2Fo − Fc electron density map (contoured
at 1.5 σ) is shown with the refined molecular model. (c) Cross-section of the trimer in the Gln1021 (d) layer. The glutamine side chains
form hydrogen bonds to a bound water molecule (red sphere) in the trimer core at distances ranging from 2.61 to 2.96 Å. Hydrogen bonds are
shown as purple dotted lines. (d) The “x-like” packing of Val1007 pointing toward the 3-fold symmetry axis. (e) The “y-like” packing of
Ala1003 and Ile1004 with the alternating small and large side chains facing inward to form a hydrophobic core. (f) Sequences of the
HCoV-NL63 N57 and SARS-CoV N50 (52) peptides with the observed heptad-repeat positions. Amino acids at the a, d, x, and y positions
are in boldface. The residues are numbered according to their position in the HCoV-NL63 and Urbani SARS-CoV S sequences, respectively.


packing geometry spanning the entire nine helical turns has 
not been seen before. 
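
The residue numbering of the core layers described above follows a simple arithmetic pattern, which the sketch below reproduces. It assumes only what is stated in the text and in Figure 4: the x–y–y triplets start at residue 986 and recur every seven residues with the y positions three and four residues after each x, and the canonical heptad register in the C-terminal repeats places Val1018 at an a position. The function names are hypothetical.

```python
HEPTAD = "abcdefg"


def heptad_position(resnum, a_reference=1018):
    """Heptad letter for a residue in the canonical C-terminal repeats.

    The register is anchored on Val1018 (a); it is not meaningful for the
    N-terminal x-y-y segment, which lacks a regular 3-4 periodicity.
    """
    return HEPTAD[(resnum - a_reference) % 7]


def xyy_triplets(first_x=986, count=4):
    """Residue numbers of the (x, y, y) layers, one triplet per seven residues."""
    return [(x, x + 3, x + 4) for x in range(first_x, first_x + 7 * count, 7)]


print(heptad_position(1021))   # Gln1021 sits in a d layer
print(xyy_triplets())
```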

Interactions between the N57 and C42 Helices. The
C-terminal 32 residues of C42 form a nine-turn amphipathic
α-helix; amino acids 1244–1247 are in an extended conformation
(the two most N-terminal residues and the four
most C-terminal residues are disordered). Each C42 chain
packs into a hydrophobic groove formed by the interface of
adjacent N57 helices, and there are no contacts between
individual C42 chains (Figure 5a). Interestingly, the helical
region of C42 intercalates into the groove on the outside of
the flattened coiled-coil segment (residues 984–1016; Figure
5f). Side chains of the x and y residues in the coiled-coil
core point into the triangular interhelical space between two
N57 helices and a buttressing C42 helix (Figure 5b,c).
Residues at position a of C42 pack against residues at the x
position of the N57 trimer, and similarly, residues at positions
d and e of C42 fix residues at the y positions of N57 (Figure
5f). This interfacial interaction utilizes a “ridges-into-grooves”
packing mode found in globular proteins (84), in
which the ridges are formed by the a, d, and e side chains
of C42, and the grooves are lined with the x and y side chains
of N57. Beyond buttressing the N-terminal α-helix, the
extended peptide region of C42 fits into the adjoining groove
of the N57 coiled coil, forcing the C-terminal end of C42
into a rodlike structure (Figure 5a). The Leu1246 side chain
forms van der Waals contacts with the successive layers
formed by Ile1014 and Val1018 in the center of the coiled
coil.

The interaction of each C42 chain with the N57 trimer
buries ~3164 Å² of solvent-accessible surface area. As a
result, the net hydrophobic stabilization energy estimated by
the method of Eisenberg and McLachlan (85) is −43.5 kcal/mol
for the interaction. This result supports the view that
the driving force for the N57/C42 complex formation is the
hydrophobic packing between the outer-layer C42 helices
and the interior coiled-coil trimer. Moreover, this interacting
surface is interspersed with two polar residues from each
N57 chain as well as three polar amino acids and two charged
residues from each C42 chain (Figure 5f). These buried polar
interactions, in conjunction with nonclassical coiled-coil core
packing, appear to be critical in guiding formation of the
six-helix bundle. For example, the side chain of Thr1000 at
the x position of the coiled-coil trimer is oriented so as to
allow its hydroxyl group to hydrogen-bond with the carbonyl
oxygen of Ala996 (y) of the same chain at distances ranging
from 2.74 to 2.88 Å (Figure 5d). During the refinement
of the structure, 8.6, 6.5, and 4.4 σ peaks of electron density
appeared near the N termini of C42 helices A, B, and C,
respectively, which were modeled as sodium ions (see
Materials and Methods). The side chain of Glu1251 of each
C42 helix coordinates to a sodium ion and forms a salt-bridge
chains drawn as an atomic model. The solvent-accessible surface is colored according to the local electrostatic potential, ranging from +12
V in dark blue (most positive) to −19 V in deep red (most negative). (b) Cross-section of the N57(L10)C42 trimer showing side-chain
packing of Leu1262 (blue) at the a position of C42 against Thr1000 (green) at the x position of N57. The 2Fo − Fc electron density map
(contoured at 1.5 σ) is shown with the refined molecular model. (c) Cross-section of the N57(L10)C42 trimer showing side-chain packing
of Thr1265 and Thr1266 (blue) at the d and e positions of C42 against Ala996 and Ile997 (green) at the y positions of N57. (d) A buried
hydrogen-bonding interaction between the side chain of Thr1000 (x) and the carbonyl group of Ala996 (y). Hydrogen bonds are shown as
purple dotted lines. (e) A buried interhelical salt-bridge between Glu1251 (d) and Lys1013; the carboxylate group coordinates to a bound
sodium ion (pink sphere). (f) Schematic representation of packing interactions between two N57 helices and a C42 helix. Packing interactions
between residues at positions x and y of N57 and residues at positions a, d, and e of C42 are indicated by solid lines. The side chain of
Leu1246 of C42 packs against Ile1014 (x) and Val1018 (a) of N57 (indicated by dotted lines).


with Lys1013 (Oε–Nζ = 2.79 Å), and it is packed against
Leu1011 at the y position of the central coiled-coil trimer
(Figure 5e). All of the interfacial residues at the buried core
positions of this N57/C42 complex are highly conserved
among the group 1 S2 proteins (Figure 1b), presumably
reflecting selective pressure on both trimeric coiled-coil
interactions and specific interactions between the HR regions
in membrane fusion.
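
The solvation-energy estimate quoted above follows the general form of the Eisenberg and McLachlan approach, in which each atom class contributes its buried solvent-accessible area multiplied by an atomic solvation parameter. The sketch below is only an illustration of that arithmetic. The parameter values are the commonly quoted ones and should be checked against reference 85, the function name `burial_free_energy` is hypothetical, and the per-class areas are invented for the example, since the per-atom breakdown of the ~3164 Å² interface is not given here.

```python
# Atomic solvation parameters in cal/(mol * A^2). These are the commonly
# quoted Eisenberg-McLachlan values; treat them as an assumption and
# verify against the original reference before any real use.
ASP = {"C": 16.0, "N/O": -6.0, "O-": -24.0, "N+": -50.0, "S": 21.0}


def burial_free_energy(buried_area):
    """Estimate the free energy (kcal/mol) of burying the given areas.

    buried_area maps an atom class (a key of ASP) to the buried
    solvent-accessible surface area in A^2. Burying apolar carbon is
    favorable, so the sum is negated to give a stabilization energy.
    """
    total_cal = sum(ASP[atom] * area for atom, area in buried_area.items())
    return -total_cal / 1000.0


# Hypothetical per-class areas for illustration only (not data from this study).
example = {"C": 1000.0, "N/O": 200.0}
print(round(burial_free_energy(example), 1))  # -14.8
```

With these conventions, an interface dominated by buried carbon gives a negative (stabilizing) value, which is the qualitative behavior reported for the N57/C42 interaction.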

Comparison with Other Coronavirus S2 Protein Structures.
The N57/C42 complex shows structural similarity to
soluble S2 cores from SARS-CoV and MHV (45–48, 52),
although HCoV-NL63 S2 contains 14-residue in-phase
insertions in both HR sequences (Figure 1b). The three
six-helix bundles can be superimposed on each other with a
root-mean-square (rms) deviation for the Cα atoms of 1.5–4.1 Å
and with better superposition of the inner N-peptide
coiled-coil core than the outer-layer C peptide (Figure 6). The
trimeric coiled coil of MHV S2 has a regular 3-4 hydrophobic
periodicity (48); the HCoV-NL63 and SARS-CoV
coiled coils are atypical because they contain heptad-repeat
anomalies, generating the so-called “x layers” and “y layers”.
Each of these layers has its own distinguishing geometry,
but both display non-close-core packing of apolar residues.
As in the HCoV-NL63 S2 core structure, the flattened
coiled-coil segment comprising four successive alternating x and y
layers provides a snug fit for intercalation of the C-terminal
α-helical region of HR2. In addition, the C-terminal helices
pack against a central N-terminal coiled coil with very
different conformations in the three soluble S2 core
structures.
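
For readers unfamiliar with the rms deviation quoted above, the following minimal sketch shows how the quantity is computed for two sets of paired Cα coordinates. It assumes the two structures have already been optimally superimposed and that atoms are matched one-to-one; the superposition step itself is not shown, and the function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input, e.g. A).

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples
    for paired atoms. The structures are assumed to be superimposed
    already; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two hypothetical atom pairs: one coincident, one displaced by 2 A along z.
print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
           [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]))
```

Because every atom pair is weighted equally, a few poorly matched regions, such as the ends of the C peptides, can dominate the overall value.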

Biological Implications. The structural and thermodynamic
features of the proteolytically resistant S2 core from
HCoV-NL63 presented here provide an opportunity to investigate
several factors to be considered in the analysis of S
glycoprotein-mediated virus entry. First, formation of the
stable six-helix bundle in S2 drives the C-terminal
transmembrane anchor toward the fusion peptide, allowing the
two membrane attachment points to come together in the
trimer-of-hairpins structure (40–43). The N57/C42 structure
is a trimer of helical hairpins and likely represents the
fusogenic conformation of the HCoV-NL63 S2 protein.
Although its overall architecture resembles those of the
SARS-CoV and MHV S2 cores, the N57/C42 structure
includes significant variations due to the presence of
14-residue insertions in both HR sequences. The six-helix bundle
formed by the N57 and C42 peptides shows an extensive
x–y–y-like coiled-coil packing interaction. As a consequence,
the 32-residue α-helix of each C42 peptide intercalates
into the hydrophobic core of the central N-terminal
coiled coil. By contrast, the SARS-CoV and MHV S2 cores
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FIGURE 6: Comparison of the N57/C42 complex with the structures 
of the SARS-CoV and MHV S protein fusion cores. The figure 
shows superposition of main chain coordinates for the HCoV-NL63 
N57/C42 (red), SARS-CoV N50/C36 (1ZV8; green) and MHV 
2-Helix (1DWF; yellow) structures. Residues 901–949 and 1152–1182
of SARS-CoV S2 (52) and 970–1023 and 1216–1252 of
MHV S2 (48) are included. The N termini of N helices point toward
the top of the page, and those of C helices point toward the bottom.
The positions of the 14-residue sequence insertions (blue) in
HCoV-NL63 S2 are shown. The largest deviations occur at the N- and
C-terminal ends of the C peptide regions in the six-helix bundle
structures.


demonstrate more regular core packing geometry, and the
numbers of helical residues in their corresponding C-peptides
are 19 and 22, respectively (45, 46, 48, 52). Thermal
unfolding studies show that the N57/C42 complex is exceedingly
stable, with a Tm value of 78 °C at 50 μM protein
concentration in the presence of the denaturant GuHCl at 5
M concentration. At lower denaturant concentrations, the
soluble S2 cores from SARS-CoV and MHV are less stable
(22, 49, 52, 86). The greater conformational stability of the
N57/C42 complex can be ascribed to the large interfacial
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contacts between the N- and C-peptide helices. This unusu- 
ally large binding energy may contribute to driving further 
conformational changes in the group 1 coronaviruses (see 
below). 

Second, we previously suggested that the HR regions,
initially sequestered in the native S glycoprotein spike, are
released and refold to promote membrane fusion (52).
C-peptide fusion inhibitors are likely to act during this
process by binding to the N-peptide triple-stranded coiled
coil in a transient “prehairpin” intermediate, thereby disrupting
trimer-of-hairpins formation and blocking viral entry (40
and references therein). In the coronaviruses, synthetic
C-peptides inhibit entry of the HCoV-NL63, MHV, and
SARS-CoV viruses with IC50 values of 0.5, 0.9, and 17.0
μM, respectively (50, 55, 56). The high thermodynamic
stability of the N57/C42 complex suggests that the exogenous
C-peptides could exhibit a high binding affinity for the S2
prehairpin intermediate. We should nonetheless emphasize
that the atypical central coiled-coil conformation observed
in the N57/C42 structure is likely to be unstable in isolation
(see above). This property could therefore result in only
transient exposure of the C-peptide binding site on the
HCoV-NL63 S2 intermediate state and place temporal
restrictions on C-peptide inhibition. Further analysis of the
kinetic dependence of C-peptide inhibition on the S2
activation process may aid the development of improved
HCoV-NL63 fusion inhibitors.

Third, viral envelope glycoproteins that contain class I
fusion moieties are typically synthesized as single-chain
precursors and subsequently cleaved by a cellular protease
to yield the canonical receptor-binding and transmembrane
fusion subunits (40–44). This maturational cleavage is
required for subsequent activation of the membrane fusion
activity required for infectivity (44, 87). Interestingly, the S
glycoproteins of the group 1 coronaviruses, including
HCoV-NL63, are not subjected to proteolytic cleavage during
biogenesis (28–30). In contrast, the group 2 viruses such as
MHV and the avian group 3 viruses express an S glycoprotein
that is cleaved into two noncovalently associated subunits
(S1 and S2) by furin-like enzymes during processing in the
Golgi in the producer cell (22, 24–27). The stabilizing effects
of the unique 14-residue insertions on the six-helix bundle
structure of the group 1 viruses may offer a mechanistic
explanation of this dimorphism in proteolytic maturation of
the coronavirus S glycoprotein precursors. We propose that
the highly favorable interactions between the extended HR
regions of the group 1 viruses constitute a reservoir of free
energy in the uncleaved S glycoprotein, primed to drive the
S2 fusion reaction when triggered by viral interaction with
the target cell. Lack of proteolytic cleavage in the group 1
S glycoproteins might thus be evolutionarily linked to
formation of the extremely stable trimer-of-hairpins structure.
In this regard, it is noteworthy that the recently emerged
SARS-CoV presents an exception to this model in expressing
an uncleaved spike that lacks the HR insertion sequences
(29, 30, 35, 39, 88). Recent studies suggest that entry by
SARS-CoV is dependent on proteolysis of the S glycoprotein
by the cysteine protease cathepsin L in a late endosomal or
lysosomal compartment of the target cell (89). In contrast,
HCoV-NL63 entry, which utilizes the same receptor, is not
dependent on cathepsin activity (90), although the involvement
of another host protease cannot be ruled out at present.
Cleavage of the SARS-CoV S glycoprotein during endocytosis
may enable the large-scale structural transitions
required for membrane fusion. Thus, it is possible that the
group 1 coronaviruses have evolved a different machinery
to cooperatively link proteolytic processing and membrane
fusion, potentially resulting in additional spatial or temporal
regulation. The HCoV-NL63 S2 core structure serves as the
starting point for addressing the role of proteolytic maturation
in coronavirus membrane fusion and entry.
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